Abstract

We improve the inequality used in (Pronzato, 2003) to remove points from the
design space during the search for a D-optimum design. Let $\xi$ be any design on
a compact space $\mathcal{X} \subset \mathbb{R}^m$ with a nonsingular information
matrix, and let $m + \epsilon$ be the maximum of the variance function $d(\xi, x)$
over all $x \in \mathcal{X}$. We prove that any support point $x_*$ of a D-optimum
design on $\mathcal{X}$ must satisfy the inequality
$d(\xi, x_*) \geq m\left(1 + \epsilon/2 - \sqrt{\epsilon(4 + \epsilon - 4/m)}/2\right)$.
We show that this new lower bound on $d(\xi, x_*)$ is, in a sense, the best
possible, and how it can be used to accelerate algorithms for D-optimum design.
1 Introduction



Let $\mathcal{X} \subset \mathbb{R}^m$ be a compact design space and let $\Xi$ be
the set of all designs (i.e., finitely supported probability measures) on
$\mathcal{X}$. For any $\xi \in \Xi$, let

$$ M(\xi) = \int_{\mathcal{X}} x x^\top \, \xi(dx) $$

denote the information matrix. Suppose that there exists a design with
nonsingular information matrix and let $\Xi^+$ be the set of such designs. Let
$\xi^*$ denote a D-optimum design, that is, a measure in $\Xi$ that maximizes
$\det M(\xi)$, see, e.g., (Fedorov, 1972). Note that a D-optimum design always
exists and that the D-optimum information matrix $M_* = M(\xi^*)$ is unique. For
any $\xi \in \Xi^+$ denote by $d(\xi, \cdot): \mathcal{X} \to [0, \infty)$ the
variance function defined by

$$ d(\xi, x) = x^\top M^{-1}(\xi)\, x . $$



The celebrated Kiefer-Wolfowitz Equivalence Theorem (1960) can be stated as
follows.

Theorem 1 The following three statements are equivalent:

(i) $\xi^*$ is D-optimum;

(ii) $\max_{x \in \mathcal{X}} d(\xi^*, x) = m$;

(iii) $\xi^*$ minimizes $\max_{x \in \mathcal{X}} d(\xi, x)$, $\xi \in \Xi^+$.

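Notice that

$$ \int_{\mathcal{X}} d(\xi^*, x)\, \xi^*(dx) = \int_{\mathcal{X}} x^\top M_*^{-1} x \, \xi^*(dx) = \mathrm{trace}(M_* M_*^{-1}) = m . $$

Hence, (ii) of Theorem 1 implies that for any support point $x_*$ of the design
$\xi^*$ (i.e., for a point satisfying $\xi^*(x_*) > 0$), we have

$$ d(\xi^*, x_*) = m . \tag{1} $$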
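As a small numerical illustration of these definitions, the following Python sketch computes the information matrix and the variance function of a finitely supported design, and checks condition (ii) and equality (1) on a toy case. NumPy, the function names `information_matrix` and `variance_function`, and the toy design space (the standard basis of $\mathbb{R}^3$, for which the uniform design is D-optimum) are assumptions of this example, not part of the original text.

```python
import numpy as np

def information_matrix(points, weights):
    """M(xi) = sum_i w_i x_i x_i^T for a design with finitely many support points."""
    X = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    return X.T @ (w[:, None] * X)

def variance_function(points, weights, x):
    """d(xi, x) = x^T M(xi)^{-1} x (requires a nonsingular M(xi))."""
    M = information_matrix(points, weights)
    x = np.asarray(x, dtype=float)
    return float(x @ np.linalg.solve(M, x))

# Toy design space (an assumption of this example): the standard basis of R^3.
m = 3
basis = np.eye(m)
uniform = np.full(m, 1.0 / m)

# Under the uniform design every point of the space has variance m,
# so the maximum equals m (condition (ii)) and (1) holds at each support point.
d_values = [variance_function(basis, uniform, x) for x in basis]
print(d_values)
```

With a non-uniform design on the same points the maximum of the variance function exceeds $m$, which is the situation exploited in the next section.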



In the next section we show that the equality (1) can be used to prove that

$$ \forall \xi \in \Xi^+, \quad d(\xi, x_*) \geq m \lambda_1^*(\xi) , $$

where $\lambda_1^*(\xi)$ depends on $\xi$ only via the maximum of $d(\xi, \cdot)$
over the design space $\mathcal{X}$. Hence, we can test candidate support points
by using any finite number of design measures $\xi \in \Xi^+$, e.g., those that
are generated by a design algorithm on its way towards the optimum: any point
that does not pass the test defined by the design $\xi^k$ of iteration $k$ need
not be considered any further and can thus be removed from the design space.
2 A necessary condition for candidate support points

For $\xi$ a design in $\Xi^+$ denote $M = M(\xi)$,

$$ H = M^{-1/2} M_* M^{-1/2} $$

and $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_m$ the eigenvalues of $H$.
Notice that $\lambda_1 > 0$ and that the eigenvalues depend on the design $\xi$
as well as on the D-optimum information matrix $M_*$. Let $x_*$ be a support
point of a D-optimum design and let $y_* = H^{-1/2} M^{-1/2} x_*$. The equality
(1) can be written in the form $y_*^\top y_* = m$, which implies:

$$ d(\xi, x_*) = x_*^\top M^{-1} x_* = y_*^\top H y_* \geq \lambda_1 y_*^\top y_* = m \lambda_1 . \tag{2} $$

To be able to use the inequality (2), we need to derive a lower bound
$\lambda_1^*$ on $\lambda_1$ that does not depend on the unknown matrix $M_*$.

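Theorem 1-(ii) implies

$$ \sum_{i=1}^m \lambda_i^{-1} = \mathrm{trace}(H^{-1}) = \mathrm{trace}(M_*^{-1} M) = \int_{\mathcal{X}} x^\top M_*^{-1} x \, \xi(dx) = \int_{\mathcal{X}} d(\xi^*, x)\, \xi(dx) \leq m . $$

Also,

$$ \sum_{i=1}^m \lambda_i = \mathrm{trace}(H) = \mathrm{trace}(M_* M^{-1}) = \int_{\mathcal{X}} x^\top M^{-1} x \, \xi^*(dx) \leq \max_{x \in \mathcal{X}} x^\top M^{-1} x = m + \epsilon , $$

where we used the notation

$$ \epsilon = \epsilon(\xi) = \max_{x \in \mathcal{X}} x^\top M^{-1} x - m \geq 0 . \tag{3} $$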



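For $m = 1$ we directly obtain the lower bound $\lambda_1 \geq \lambda_1^* = 1$.
For $m > 1$, the Lagrangian for the minimisation of $\lambda_1$ subject to
$\sum_{i=1}^m \lambda_i^{-1} \leq m$ and $\sum_{i=1}^m \lambda_i \leq m + \epsilon$
is given by

$$ \mathcal{L}(\lambda, \mu_1, \mu_2) = \lambda_1 + \mu_1 \left( \sum_{i=1}^m \lambda_i^{-1} - m \right) + \mu_2 \left( \sum_{i=1}^m \lambda_i - m - \epsilon \right) $$

with $\lambda = (\lambda_1, \ldots, \lambda_m)^\top$, $\mu_1, \mu_2 \geq 0$. The
stationarity of $\mathcal{L}(\lambda, \mu_1, \mu_2)$ with respect to the
$\lambda_i$'s and the Kuhn-Tucker conditions

$$ \mu_1 \left( \sum_{i=1}^m \lambda_i^{-1} - m \right) = 0 , \qquad \mu_2 \left( \sum_{i=1}^m \lambda_i - m - \epsilon \right) = 0 $$

give $\lambda_i = L$ for $i = 2, \ldots, m$, with $\lambda_1$ and $L$ satisfying

$$ \lambda_1^{-1} + (m - 1) L^{-1} = m , \qquad \lambda_1 + (m - 1) L = m + \epsilon . $$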
The solution is thus

$$ \lambda_1^* = 1 + \frac{\epsilon}{2} - \frac{\sqrt{\epsilon(4 + \epsilon - 4/m)}}{2} \leq 1 \tag{4} $$

and $\lambda_i^* = L^* = (m - 1)/(m - 1/\lambda_1^*) \geq 1$, $i = 2, \ldots, m$.
Notice that the bound (4) gives $\lambda_1^* = 1$ when $m = 1$ and can thus be
used for any dimension $m \geq 1$. By substituting $\lambda_1^*$ for $\lambda_1$
in (2) we obtain the following result.
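The step from the two constraint equations to the closed form of the bound is not spelled out above. One way to fill it in, offered here as a sketch, is to eliminate $L$ and solve the resulting quadratic in the smallest eigenvalue:

```latex
% Eliminate L from the two active constraints, then solve for \lambda_1.
\begin{align*}
\lambda_1^{-1} + (m-1)L^{-1} = m
  &\;\Longrightarrow\; L = \frac{m-1}{m - 1/\lambda_1}, \\
\lambda_1 + \frac{(m-1)^2}{m - 1/\lambda_1} = m + \epsilon
  &\;\Longrightarrow\; m\lambda_1^2 - m(2+\epsilon)\lambda_1 + (m+\epsilon) = 0, \\
\lambda_1 &= 1 + \frac{\epsilon}{2} \pm \frac{\sqrt{\epsilon(4+\epsilon-4/m)}}{2}.
\end{align*}
% The smaller root is the minimiser of \lambda_1 under the two constraints.
```

The discriminant simplifies because $m^2(2+\epsilon)^2 - 4m(m+\epsilon) = m^2\,\epsilon(4+\epsilon-4/m)$.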



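Theorem 2 For any design $\xi \in \Xi^+$, any point $x_* \in \mathcal{X}$ such that

$$ d(\xi, x_*) < h_m(\epsilon) = m \left[ 1 + \frac{\epsilon}{2} - \frac{\sqrt{\epsilon(4 + \epsilon - 4/m)}}{2} \right] , \tag{5} $$

where $\epsilon = \max_{x \in \mathcal{X}} d(\xi, x) - m$, cannot be a support
point of a D-optimum design measure.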
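The test of Theorem 2 is straightforward to apply once the variance function has been evaluated at every candidate point. The sketch below assumes a finite design space; the function names `h_bound` and `removable` and the five variance values are hypothetical, chosen only for illustration.

```python
import math

def h_bound(eps, m):
    """Lower bound h_m(eps) on d(xi, x*) at a D-optimum support point."""
    return m * (1.0 + eps / 2.0 - math.sqrt(eps * (4.0 + eps - 4.0 / m)) / 2.0)

def removable(d_values, m):
    """Indices i with d(xi, x_i) < h_m(eps), where eps = max_i d(xi, x_i) - m.

    d_values must list d(xi, x) for every point x of a finite design space.
    eps is clipped at 0 to guard against rounding when xi is (nearly) optimum.
    """
    eps = max(max(d_values) - m, 0.0)
    threshold = h_bound(eps, m)
    return [i for i, d in enumerate(d_values) if d < threshold]

# Hypothetical variance values for five candidate points with m = 3:
# here eps = 0.4 and the threshold is roughly 1.94.
print(removable([3.4, 2.9, 1.2, 0.8, 2.1], m=3))  # -> [2, 3]
```

Only the points whose variance falls below the threshold are discarded; a point with variance between the threshold and $m$ is kept because the test cannot rule it out.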



The inequality in (Pronzato, 2003) uses

$$ \tilde{h}_m(\epsilon) = m \left[ 1 + \frac{\epsilon}{2} - \frac{\sqrt{\epsilon(4 + \epsilon)}}{2} \right] . \tag{6} $$



Notice that $m \geq h_m(\epsilon) \geq \tilde{h}_m(\epsilon)$ for all integer
$m \geq 1$ and all $\epsilon \geq 0$, and that
$\lim_{\epsilon \to \infty} h_m(\epsilon) = 1$ while
$\lim_{\epsilon \to \infty} \tilde{h}_m(\epsilon) = 0$. The new bound is thus
always stronger, especially for large values of $\epsilon$, i.e. when the design
$\xi$ is far from being optimum. Although in practice the improvement over (6)
can be marginal, see the example below, the important result here is that the
bound (5) cannot be improved. Indeed, when $m = 1$, $h_1(\epsilon) = 1$ for any
$\epsilon \geq 0$, which is clearly the best possible bound. When $m \geq 2$,
$h_m(\epsilon)$ is the tightest lower bound on the variance function
$d(\xi, x_*)$ at a D-optimal support point $x_*$ that depends only on $m$ and
$\epsilon$, in the sense of the following theorem.
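The ordering of the two bounds and their limiting behaviour can be checked numerically. The sketch below tabulates both for $m = 3$; the names `h_new` and `h_old` and the chosen values of the gap are assumptions of this example.

```python
import math

def h_new(eps, m):
    """Bound (5)."""
    return m * (1.0 + eps / 2.0 - math.sqrt(eps * (4.0 + eps - 4.0 / m)) / 2.0)

def h_old(eps, m):
    """Bound (6), used in (Pronzato, 2003)."""
    return m * (1.0 + eps / 2.0 - math.sqrt(eps * (4.0 + eps)) / 2.0)

m = 3
for eps in (0.001, 0.1, 1.0, 10.0, 1000.0):
    # For each eps the new bound lies between the old bound and m.
    print(f"eps={eps:g}  h_new={h_new(eps, m):.4f}  h_old={h_old(eps, m):.4f}")
```

For small values of the gap the two bounds are close to each other and to $m$; for large values the new bound stays above 1 while the old one approaches 0, so the old test removes almost nothing when the design is far from optimum.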
Theorem 3 For any integer $m \geq 2$ and any $\epsilon, \delta > 0$ there exist
a compact design space $\mathcal{X} \subset \mathbb{R}^m$, a design $\xi$ on
$\mathcal{X}$ and a point $x_* \in \mathcal{X}$ supporting a D-optimum design on
$\mathcal{X}$ such that $\epsilon = \max_{x \in \mathcal{X}} d(\xi, x) - m$ and

$$ d(\xi, x_*) < h_m(\epsilon) + \delta . $$

Proof. Denote $h = h_m(\epsilon)$ and $k = 2^{m-1}$. Let $x_1, \ldots, x_k$
correspond to the $k$ vectors of $\mathbb{R}^m$ of the form

$$ \left( \sqrt{\frac{1}{h}},\ \pm\sqrt{\frac{h-1}{h(m-1)}},\ \ldots,\ \pm\sqrt{\frac{h-1}{h(m-1)}} \right)^\top $$

and let $y_1, \ldots, y_k$ correspond to the $k$ vectors
$\left(\sqrt{1/m}, \pm\sqrt{1/m}, \ldots, \pm\sqrt{1/m}\right)^\top$.
Take $x_* = (\sqrt{b}, 0, \ldots, 0)^\top \in \mathbb{R}^m$ with
$1 < b < \min\{(\epsilon + m)/h, (h + \delta)/h\}$, $\mathcal{X}$ as the finite
set $\mathcal{X} = \{x_1, \ldots, x_k, y_1, \ldots, y_k, x_*\}$ and let $\xi$ be
the uniform probability measure on $x_1, \ldots, x_k$. Note that $M(\xi)$ is a
diagonal matrix with diagonal elements
$(1/h, (h-1)/[h(m-1)], \ldots, (h-1)/[h(m-1)])$. One can easily verify that

$$ \max_{x \in \mathcal{X}} x^\top M^{-1}(\xi) x - m = \epsilon \quad \text{and} \quad d(\xi, x_*) = x_*^\top M^{-1}(\xi) x_* = bh < h_m(\epsilon) + \delta . $$

The uniform probability measure $\eta$ on $y_1, \ldots, y_k$ is D-optimum on
$\mathcal{X} \setminus \{x_*\}$, as can be directly verified by checking (ii) of
the Equivalence Theorem 1. On the other hand, $\eta$ is not D-optimum on
$\mathcal{X}$ since $x_*^\top M^{-1}(\eta) x_* = bm > m$, which implies that
$x_*$ must support a D-optimum design on $\mathcal{X}$. ■
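The quantities that the proof asks the reader to verify can be reproduced numerically. The sketch below builds the construction for one illustrative choice of dimension, gap and tolerance; NumPy, the helper names, the specific values, and the choice of $b$ as the midpoint of its admissible interval are all assumptions of this example.

```python
import itertools
import numpy as np

def h_bound(eps, m):
    """Bound (5)."""
    return m * (1.0 + eps / 2.0 - np.sqrt(eps * (4.0 + eps - 4.0 / m)) / 2.0)

def d(points, x):
    """d(xi, x) for xi uniform on the rows of `points`."""
    M = points.T @ points / len(points)
    return float(x @ np.linalg.solve(M, x))

m, eps, delta = 3, 1.0, 0.1                      # illustrative values (assumptions)
h = h_bound(eps, m)
# All sign patterns for coordinates 2..m, giving k = 2^(m-1) vectors.
signs = np.array(list(itertools.product([1.0, -1.0], repeat=m - 1)))
k = len(signs)
a = np.sqrt((h - 1.0) / (h * (m - 1)))
xs = np.hstack([np.full((k, 1), np.sqrt(1.0 / h)), a * signs])
ys = np.hstack([np.full((k, 1), np.sqrt(1.0 / m)), np.sqrt(1.0 / m) * signs])
# Any b strictly between 1 and the minimum works; take the midpoint.
b = 0.5 * (1.0 + min((eps + m) / h, (h + delta) / h))
x_star = np.zeros(m)
x_star[0] = np.sqrt(b)
space = np.vstack([xs, ys, x_star])

eps_check = max(d(xs, x) for x in space) - m     # reproduces eps
d_star = d(xs, x_star)                           # equals b*h, below h + delta
d_eta_others = max(d(ys, x) for x in np.vstack([xs, ys]))  # equals m
d_eta_star = d(ys, x_star)                       # equals b*m, above m
print(eps_check, d_star, h + delta, d_eta_others, d_eta_star)
```

The last two values show that the uniform design on the $y_i$ satisfies condition (ii) on the space without $x_*$ but violates it once $x_*$ is added.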

Example: We consider a series of problems defined by the construction of the
minimum covering ellipse for an initial set of 1000 random points in the plane,
i.i.d. $\mathcal{N}(0, I_2)$. These problems correspond to D-optimum design
problems for randomly generated $\mathcal{X} \subset \mathbb{R}^3$, see
Titterington (1975, 1978). The following recursion can thus be used:

$$ w_i^{k+1} = w_i^k \, \frac{d(\xi^k, x_i)}{m} , \quad i = 1, \ldots, q(k) , \tag{7} $$



where $k \geq 0$, $w_i^k = \xi^k(x_i)$ is the weight given by the discrete design
$\xi^k$ to the point $x_i$ and $q(k)$ is the cardinality of $\mathcal{X}$ at
iteration $k$. In the original algorithm, $q(k) = q(0)$ for all $k$ and,
initialized at a $\xi^0$ that gives a positive weight to each point of
$\mathcal{X}$, the algorithm converges monotonically to the optimum, see
(Torsney, 1983) and (Titterington, 1976). The tests (5) and (6) can be used to
decrease $q(k)$: at iteration $k$, any design point $x_i$ satisfying
$d(\xi^k, x_i) < h_m[\epsilon(\xi^k)]$, see (5), or
$d(\xi^k, x_i) < \tilde{h}_m[\epsilon(\xi^k)]$, see (6), can be removed from
$\mathcal{X}$. The total weight of the points that are cancelled is then
reallocated to the $x_i$'s that stay in $\mathcal{X}$ (e.g., proportionally to
$w_i^k$).
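The recursion and the removal step can be sketched as follows. This is a minimal illustration rather than the implementation used for the reported experiments: NumPy, the function names, the stopping rule on the gap, the order of the update and removal steps, the lift of each planar point $z$ to $(1, z)$, and the use of 200 points are all assumptions of this example.

```python
import numpy as np

def h_bound(eps, m):
    """Bound (5); eps is clipped at 0 to guard against rounding."""
    eps = max(eps, 0.0)
    return m * (1.0 + eps / 2.0 - np.sqrt(eps * (4.0 + eps - 4.0 / m)) / 2.0)

def variances(X, w):
    """d(xi, x_i) for every row x_i of X, with xi putting weight w_i on x_i."""
    M = X.T @ (w[:, None] * X)
    return np.einsum("ij,ij->i", X @ np.linalg.inv(M), X)

def multiplicative(X, delta=1e-3, prune=True, max_iter=50_000):
    """Multiplicative recursion from a uniform start, optionally with test (5).

    Returns the remaining points, their weights and the last iteration index.
    """
    X = np.asarray(X, dtype=float)
    m = X.shape[1]
    w = np.full(len(X), 1.0 / len(X))
    for k in range(max_iter):
        d = variances(X, w)
        eps = d.max() - m
        if eps < delta:
            break
        w = w * d / m                      # weight update; weights still sum to 1
        if prune:
            keep = d >= h_bound(eps, m)    # points failing the test are dropped
            X, w = X[keep], w[keep]
            w = w / w.sum()                # reallocate removed weight proportionally
    return X, w, k

rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 2))          # fewer points than in the text, for speed
X0 = np.hstack([np.ones((200, 1)), Z])     # assumed lift of planar points to R^3
Xp, wp, kp = multiplicative(X0, prune=True)
Xf, wf, kf = multiplicative(X0, prune=False)
print(len(Xp), kp, kf)
```

Both runs stop at a design whose gap is below the tolerance, so their log-determinants agree to within that tolerance; the run with removal simply carries far fewer points through each iteration.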
Figure 1 presents a typical evolution of $q(k)$ as a function of $\log(k)$ for
$\xi^0$ uniform on $\mathcal{X}$ and shows the superiority of the test (5) over
(6). The improvement is especially important in the first iterations, when the
design $\xi^k$ is far from the optimum. Define $k^*(\delta)$ as the number of
iterations required to reach a given precision $\delta$,

$$ k^*(\delta) = \min\{ k \geq 0 : \epsilon(\xi^k) < \delta \} , $$

with $\epsilon(\xi^k)$ defined by (3). Notice that from the concavity of
$\log\det M(\xi)$ we have

$$ \log\det M(\xi^*) - \log\det M(\xi^{k^*(\delta)}) \leq \left. \frac{\partial \log\det M[(1-\alpha)\xi^{k^*(\delta)} + \alpha\xi^*]}{\partial\alpha} \right|_{\alpha=0} = \int_{\mathcal{X}} d(\xi^{k^*(\delta)}, x)\, \xi^*(dx) - m < \delta . $$
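The value of the directional derivative used in this bound can be recovered in a few lines. The following is a sketch for a generic design with nonsingular information matrix, relying only on the linearity of the information matrix in the design and on the standard derivative of a log-determinant:

```latex
% Directional derivative of log det M at \xi in the direction of \xi^*.
\begin{align*}
\left.\frac{\partial}{\partial\alpha}\log\det M[(1-\alpha)\xi+\alpha\xi^*]\right|_{\alpha=0}
 &= \mathrm{trace}\!\left\{ M^{-1}(\xi)\,[M(\xi^*)-M(\xi)] \right\} \\
 &= \int_{\mathcal{X}} x^\top M^{-1}(\xi)\,x\;\xi^*(dx) - m
  = \int_{\mathcal{X}} d(\xi,x)\,\xi^*(dx) - m \\
 &\le \max_{x\in\mathcal{X}} d(\xi,x) - m = \epsilon(\xi).
\end{align*}
```

Stopping as soon as the gap falls below the chosen precision therefore guarantees that the log-determinant is within that precision of its optimal value.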

Table 1 shows the influence on the algorithm (7) of the cancellation of points
based on the tests (5) and (6), in terms of $k^*(\delta)$, of the corresponding
computing time $T(\delta)$, the number of support points $n(\delta)$ of
$\xi^{k^*(\delta)}$ and the first iteration $k_{10}$ when $\xi^k$ has 10 support
points or less, with $\delta = 10^{-3}$. The results are averaged over 1000
independent problems. The values of $k^*(\delta)$ and $k_{10}$ are rounded up to
the nearest integer, and the computing time for the algorithm with the
cancellation of points based on (5) is taken as reference and set to 1 (the
algorithm without cancellation was at least 4.5 times slower in all the 1000
repetitions). Although cancelling points has little influence on the number of
iterations $k^*(\delta)$, it renders the iterations simpler: on average the
introduction of the test (5) in the algorithm (7) makes it about 30 times faster.




Fig. 1. $q(k)$ as a function of $\log(k)$ (horizontal axis: $\log(k)$):
cancellation based on (5) in solid line, on (6) in dashed line.

Algorithm   | $k^*(\delta)$ | $T(\delta)$ | $n(\delta)$ | $k_{10}$
(7)         | 252           | 31.6        | 1000        | -
(7) and (6) | 248           | 1.4         | 5.8         | 82
(7) and (5) | 247           | 1           | 5.5         | 66

Table 1
Influence of the tests (5) and (6) on the average performance of the algorithm (7)
for the minimum covering ellipse problem (1000 repetitions, $\delta = 10^{-3}$).

The influence of the cancellation on the performance of the algorithm can be
further improved as follows. Let $(k_j)_j$ denote the subsequence corresponding
to the iterations where some points are removed from $\mathcal{X}$. We have
$j < q(0)$, the cardinality of the initial $\mathcal{X}$, and the convergence of
the algorithm (7) is therefore maintained whatever the heuristic rule used at
the iterations $k_j$ for updating the weights of the points that stay in
$\mathcal{X}$ (provided these weights remain strictly positive). The following
one has been found particularly efficient on a series of examples: for all
$t \in T_j$, the set of indices corresponding to the points that stay in
$\mathcal{X}$ at iteration $k_j$, replace $w_t^{k_j}$ by

$$ w_t'^{\,k_j} = \frac{z_t}{\sum_{s \in T_j} z_s} \quad \text{where} \quad z_t = \begin{cases} \Lambda\, w_t^{k_j} & \text{if } d(\xi^{k_j}, x_t) > m , \\ w_t^{k_j} & \text{otherwise,} \end{cases} $$

for some $\Lambda > 1$. A final remark is that by including the test (5) in the
algorithm (7) one can in general quickly identify potential support points for
an optimum design. When the number $n$ of these points is small enough,
switching to a more standard convex-programming algorithm for the optimization
of the $n$ associated weights might then form a very efficient strategy.
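The reweighting heuristic applied at the removal iterations amounts to a few lines of code. In the sketch below the function name `reweight`, the parameter name `lam` for the multiplicative factor, its default value of 2, and the sample numbers are all assumptions of this example; the text only requires the factor to exceed 1.

```python
def reweight(weights, d_values, m, lam=2.0):
    """Boost by `lam` the weights of retained points with d(xi, x_t) > m, then renormalise.

    `weights` and `d_values` refer only to the points kept at a removal iteration;
    lam = 2.0 is an arbitrary illustrative default, not a value from the text.
    """
    z = [lam * w if d > m else w for w, d in zip(weights, d_values)]
    total = sum(z)
    return [zt / total for zt in z]

# Three retained points with m = 3: the first and third have variance above m,
# so their weights are doubled before renormalisation.
new_w = reweight([0.25, 0.25, 0.5], [4.0, 2.0, 3.5], m=3)
print(new_w)
```

Because every weight is multiplied by a positive factor before renormalisation, the new weights stay strictly positive, which is the only condition the text imposes for convergence to be maintained.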